Parallel Multiple Sequence Alignment with Decentralized Cache Support
Abstract
In this paper we present a new method for aligning large sets of biological sequences. The method performs a sequence alignment in parallel and uses a decentralized cache to store intermediate results. The method allows alignments to be recomputed efficiently when new sequences are added or when alignments of different precisions are requested. Our method can be used to solve important biological problems like the adaptive update of a complete evolution tree when new sequences are added (without recomputing the whole tree). To validate the method, some experiments were performed using up to 512 Small Subunit Ribosomal RNA sequences, which were analyzed with different levels of
Similar resources
Cache-Based Parallelization of Multiple Sequence Alignment Problem
In this paper we present a new approach to the problem of parallel multiple sequence alignment. The proposed method is based on a caching technique and aims to solve, with high precision, large alignment instances on heterogeneous computational clusters. The cache is used to store partial alignment guiding trees which can be reused in future computations, and is applied t...
Efficient Cache-oblivious String Algorithms for Bioinformatics
For each of these three problems we present cache-oblivious algorithms that match the best-known time complexity, match or improve the best-known space complexity, and improve significantly over the cache-efficiency of earlier algorithms. We also show that these algorithms are easily parallelizable, and we analyze their parallel performance. We present experimental results that show that all th...
An Application of the ABS LX Algorithm to Multiple Sequence Alignment
We present an application of ABS algorithms for multiple sequence alignment (MSA). The Markov decision process (MDP) based model leads to a linear programming problem (LPP), whose solution is linked to a suggested alignment. Important features of our work include the ability to align multiple sequences simultaneously and the absence of any limit on the length of the sequences. Our goal here is to ...
Topic 17: High-Performance Bioinformatics
Bioinformatics is the science of managing, mining, and interpreting information from biological sequences and structures. Genome sequencing projects have contributed to an exponential growth in complete and partial sequence databases. Similarly, the rapidly expanding structural genomics initiative aims to catalog the structure-function information for proteins. Advances in technology such as mi...
Predicting locality phases for dynamic memory optimization
Dynamic data, cache, and memory adaptation can significantly improve program performance when they are applied on long continuous phases of execution that have dynamic but predictable locality. To support phase-based adaptation, this paper defines the concept of locality phases and describes a four-component analysis technique. Locality-based phase detection uses locality analysis and signal p...